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Abstract. The p\ model is a directed random graph model used to describe dyadic inter- 
actions in a social network in terms of effects due to differential attraction (popularity) and 
expansiveness, as well as an additional effect due to reciprocation. In this article we carry 
out an algebraic statistics analysis of this model. We show that the pi model is a toric model 
specified by a multi-homogeneous ideal. We conduct an extensive study of the Markov bases 
for pi models that incorporate explicitly the constraint arising from multi-homogeneity. Our 
results are directly relevant to the estimation and conditional goodness-of-fit testing prob- 
lems in p\ models. 

1. Introduction 

The study of random graphs is an active area of research in many fields, including math- 
ematics, probability, biology and social sciences. In the social network literature, the nodes 
of the network represent population units and the edges represent relationships among indi- 
viduals. Thus, prescribing a probability distribution over a network is essentially a way of 
encoding the dynamic of interactions and behavior among individuals in a given population. 
For a review, see Goldenberg et al. [12] . 

One of the earlier and most popular statistical models for social networks is the p\ model 
of Holland and Leinhardt, described in their 1981 seminal paper [15]. In a p\ model the 
network is represented as a digraph in which, for every pair of nodes {i, j}, or dyad, one can 
observe one of four possible dyadic configurations: a directed edge from i to j, a directed 
edge from j to i, two directed edges, or no edges between i and j. Each dyad is in any of 
these configurations independently of all the other dyads. Quoting from [15] , the probability 
distribution over the entire network depends on various parameters quantifying the "attrac- 
tiveness" and "productivity" of the individual nodes, the "strength of reciprocity" of each 
dyad and the network "density," or overall tendency to have edges. See the next section for 
details. 

Despite its simplicity and interpretability, and despite being a special case of the broader 
and well studied class of log-linear models, e.g., see [TUlfTI] . the p\ model poses extraordinary 
challenges of both theoretical and practical nature. Indeed, first and foremost, the number 
of network parameters depends on the size of the network, so that, as the population size 
grows, the model complexity also increases. This remarkable feature renders the p\ model, 
as well as virtually any other network model, unlike traditional statistical models (including 
log-linear models), whose complexity is fixed and independent of the sample size. Second, 
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as assumed by Holland and Leinhardt in their original paper [T5], and as is customary in 
social network analysis, statistical inference for the p\ model is typically made on the basis of 
one realization of the network, which amounts to one observation per dyad. Because of the 
unavoidable sparsity and low information content of the observable data, and because of the 
dependence of the model complexity on the sample size, classical statistical procedures for 
estimation and goodness-of-flt testing in p\ models may be inadequate and exhibit properties 
that are poorly understood. 

In this article, we revisit the Holland-Leinhardt p\ model using the tools and language of 
algebraic geometry, as is customary in the maturing field of algebraic statistics. See, e.g., [7], 
[T7| and [8]. For the class of p\ models at hand, we identify the probabilities of each of the 
2n(n — 1) possible dyadic configurations with the indeterminates of a multi-homogeneous 
toric ideal. Our first goal is to investigate the Markov bases for p\ by computing a minimal 
generating set of this ideal first and then by removing basis elements that violate the condi- 
tion of one observation per dyad. To our knowledge, this is the first time in the Markov bases 
literature that sampling constraints of this form, known in statistics as product Multinomial 
sampling schemes, have been directly incorporated in the study of Markov bases. Our results 
prescribe a set of rules for creating Markov moves that are applicable to network data and 
offer significant efficiency improvements over existing methods for generating Markov bases. 
Our second goal is to study the toric variety associated to the pi ideal and to relate its 
geometric properties with the conditions for existence of the maximum likelihood estimator 
of the dyadic probabilities. 

The paper is organized as follows. In Section [2] we describe in detail the p\ model, its 
statistical properties and the statistical tasks of estimation and testing that motivate our 
work. In particular, we consider three different flavors of the p\ model, of increasing com- 
plexity. In Section [3] we embark on an exhaustive study of the Markov bases for p\ models. 
As we mentioned above, the constraint that there is only one observation per dyad makes 
our analysis unusually difficult. 

2. The Holland-Leindhardt p 1 model 

We consider a directed graph on the set of n nodes. The nodes correspond to units in a 
network, such as individuals, and the edges correspond to links between the units. We focus 
on dyadic pairings and keep track of whether node i sends an edge to j, or vice versa, or 
none, or both. Let Pij(l, 0) be the probability of % sending an edge toward j, and let Pij(0, 1) 
the probability of j sending an edge toward % (thus, 1 denotes the outgoing side of the edge). 
Further, Pij(l, 1) is the probability of i sending an edge to j and j sending an edge to i, 
while Pij(0, 0) is the probability that there is no edge between i and j. Thus, 

(2.1) ptf(0,0) +p fi (l,0) +Pii(0,l) +Pv(l,l) = 1, 

for each of the ("J pairs of nodes {i,j}- The Holland-Leinhardt p\ model is given as follows 
(see [IS]): 

log p^ (0,0) = Ay 
(9 9 n logPii(l, 0) = Xij + a,i + /3j + e 

K ] logpy(0,l) = + + A + 

logpy(l, 1) = + a.i + (3j + ctj + Pi + 28 + p i: j. 
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The real-valued variables 9, fy, p^ and Ay for all % < j are the model parameters. The 
parameter on controls the effect of an outgoing edge from i, the parameter f3j the effect of an 
incoming edge into j, and py the added effect of reciprocated edges (in both directions). The 
"density" parameter 9 quantifies the overall propensity of the network to have edges. Finally, 



Xij is just a normalizing constant to ensure that (2.1 ) holds for each each dyad {i,j}- See the 
original paper on the p± model [15] for an extensive interpretation of the model parameters. 

We take note here of a basic, yet a rather fundamental feature of our settings that apply 
to pi models as well as to many other statistical models for networks: data become available 
in the form of one observed network. Even though for each pair of nodes {i,j}, the four 



probabilities Py(», •) are strictly positive according to the defining equations (2.2), each dyad 
can be observed in one and only one of the 4 possible states. Thus, despite the richness and 
complexity of pi models, their statistical analysis is severely limited by the disproportionally 
small information content in the data. This fact makes p\ and, more generally, network 
models challenging, and it greatly affects our analysis, as we will show. 
We study the following special cases of the general p\ structure: 

(1) = 0: no reciprocal effect. 

(2) p^ = p: constant reciprocation. 

(3) p^ = p + pi + pj\ edge-dependent reciprocation. 

For a network on n nodes, we represent the vector of 2n(n — 1) dyad probabilities as p = 
(Pi2,pi3, • • • ,Pn-i,n) G IR 2 ™^ -1 ), where, for each i <j,Pij = (py(0, 0),pij(l, 0),Py(0, l),Py(l, 1)) G 
A3, with A3 the standard simplex in IR 4 . 

As usual in algebraic statistics, the p\ model is defined to be the set of all candidate 



probability distributions that satisfy the Holland-Leinhardt equations (2.2). By definition, 
the pi model is log-linear; that is, the set of candidate probabilities p is such that logp is in 
the linear space spanned by the rows of a matrix A, which is often called the design matrix 
of the model. Indeed, the design matrix encodes a homomorphism between two polynomial 
rings: <p n : C[py(a, b)] — > C[Ay, a i: 0i, 9, py], with i < j G {1 . . . n} and a,b G {0, 1}, induced 
by 

Pij {a,0J I r AijOti OSjPi Pj V Pij 

where a,b G {0, 1}, and we consider parameters Ay, a*, fy, p^ and 9 for i, j G {1, . . . ,n} 
as unknowns. The design matrix A simply describes the action of tp n on the probabilities 
Pij(», •). The columns of A are indexed by py's and its rows by the model parameters. The 
entries of the design matrix are either or 1; there is a 1 in the (r, c)-entry of the matrix 
if the parameter indexing row r appears in the image of the indexing the column c. For 
example, in the = 0, the matrix A is of size (2n) x (3(™)). We will create the design 

matrices in a systematic way: the rows will always be indexed by Ay, ai, . . . , a n , /3\, . . . , {3 n , 
9, p^ lexicographically in that order. The columns will be ordered in the following way: 
first fix i and j in the natural lexicographic ordering; then, within each set, vary the edge 



directions in this order: (0,0), (1,0), (0,1), (1,1). Examples can be found in section 3.2 
Furthermore, it is easy to see that the design matrix for the network on n nodes will consist 
of several copies of the 2-node matrix, placed as a submatrix of in those rows and columns 
corresponding to all 2-node subnetworks. 
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Let ( = (Cij • • • > Crf) denote the vector of model parameters, whose dimension d depends 
on the type of restrictions on the pi- Then, using the design matrix A, one can readily 
verify that the Holland-Leinhardt equations have the log-linear representation logp = A T £, 
or, equivalently, letting a^i and pk be the (k, I) entry of A and the k-th element of p, 
respectively, 

d 



Pk = JJOfc 



l) 6 . 



1=1 



2.1. Algebraic statistical challenges in p\ models. To provide some context and mo- 
tivation for our analysis, we now briefly review some fundamental statistical tasks for the 
analysis of p\ models. We point out that these problems still remain in part unsolved. 

Denote by Ma the p\ model, that is, the set of points satisfying the Holland-Leinhardt 
equations (2.2). Notice that Ma is a relatively open set in the positive orthant of IR 2n(n_1 ) of 



dimension rank(A). Let X n C IR 2n ( n_1 ) be the sample space, that is, the set of all observable 
networks on n nodes. Notice that \X n \ = A n< ^ n ~ l \ We will write every point x in the sample 
space X as x = (xi 2 , ^13, • • • , aV-i,n)> where each of the (™) subvectors x^ is a vertex of A 3 . 
Probabilistically, each x^ is the realization of a random vector in M 4 having multinomial 
distribution with size 1 and class probabilities p„- G A3, where p° = (Pi 2 ,Pi 3 , • • • ,Pn-i,n) * s 



an unknown vector in Ma- Furthermore, (2.2) implies the multinomial vectors representing 



the dyad configurations are mutually independent. 

Once a network x G X n has been observed, statisticians are interested in the following 
interrelated fundamental problems: 

1) estimation problem: to estimate the unknown vector p° G Ma', 

2) goodness-of-fit problem: to test whether the p\ model Ma can be considered appropriate, 
i.e. whether the estimate produced in part (a) is a good fit to the observed data x. 

To tackle the first problem, the most commonly used method is maximum likelihood 
estimation. Specifically, for a fixed point x G X n , the likelihood function I: Ma — > [0,1] 
given by 

Up) = II ( flPij( k ) Xij{k) 
i<j \k=i 

returns the probability of observing the network x as a function of the multinomial proba- 
bilities p G Ma- The maximum likelihood estimator, or MLE, of p° is 

(2.3) p = argmax pg Ma £ x (p), 

i.e. the vector of multinomial probabilities in Ma that makes x the most likely network to 
have been observed. See [18] for more details on maximum likelihood for this model. 

While the estimation problem is relatively well constrained, the problem of testing for 
goodness of fit has a much broader scope, as it entails testing whether the assumption that 
p° G Ma is itself appropriate. A typical goodness-of-fit testing proceeds through the following 
steps: 

1) Compute the MLE p as defined in ( |2.3| ). 

2) Compute the goodness-of-fit statistic GF{x). This quantity measures how close the MLE 
is to the observed network, or, using a statistical jargon, "how well the model Ma fits the 
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data x." Among the most commonly used statistics are Pearson's x 2 an d the likelihood ratio 
statistic: 

- Eix*)io g (g§), 

i<k fc=l ^ v ; i<k k=i // 

respectively. 

3) Reject the assumption that p° G Ma if the goodness-of-fit statistic used in step (2) is 
statistically large. 

Step (3) above is clearly the most delicate, as there is no generally valid recipe for deciding 
when GF(x) is too large, one of the reasons being that the goodness-of-fit statistic is itself 
a random variable. In many applications, it is customary to declare a model a poor fit if 
GF(x) is greater than a given deterministic threshold (depending on the size of the network 
and the number d of model parameters), which is obtained based on asymptotic approxima- 
tions. Despite their widespread use, for the present problem these approximations have no 
theoretical justification (see, e.g., [13]) and can, in fact, be very poor. 

Markov bases provide an alternative, non-asymptotic approach to performing goodness- 
of-fit tests and model selection. Their use has gained some momentum and popularity in 
recent years. Let t = Ax denote the vector of margins, or sufficient statistics, corresponding 
to the observed network. Let % = {x' G X n : Ax' = t} C X n be the fiber of t. From 
statistical theory (see, e.g., [2]), all the networks in X n belonging to the same fiber will 
produce the same MLE. Thus, from the inferential standpoint, all the networks in the same 
fiber are equivalent. Based on this, if x is the observed network and t the associated sufficient 
statistic, one can decide whether the model Ma provides a good fit if the quantity 

\{x' eT t :GF(x')>GF(x)}\ 

( 2 - 4 ) a x = 

I h\ 

is statistically large. Notice that a x is the fraction of networks in the fiber at t whose 
goodness-of-fit statistic value is larger than the value at the true network x. Heuristically, if 
a x is large, i.e. closer to 1, then p is closer to the observed network than most of the other 
points in the same fiber, thus implying that the model fits really well. 

Despite its appeal, this methodology is rarely feasible due to the high computational costs 
of determining the fiber. Thus, rather than computing the fiber exactly, one can attempt 
to estimate a x by performing a sufficiently long random walk over JFt, as follows. A Markov 
basis for M A is set of vectors M = . . . , f M } C Z 2n( - n ~^ such that 

(i) M C kernel(A); 

(ii) for each observable margin t and each x, x' G %, there exists a sequence (ei, /jj, . . . , (ejv, 
(with iV depending on both x and x'), such that tj G { — 1, 1} for all j and 

N a 

x = x + ^2 tjfij > x + e -»'Ai G Xn for a11 a = !> ■ ■ ■ ' N - 

3=1 J=l 

Notice that Markov bases are not unique. If a Markov basis is available, then, for any 
observable network x with sufficient statistics t = Ax, it is possible to estimate a x by 
walking at random inside the fiber % according to the following algorithm: 

(1) set k=0 and set x old = x; 



6 



SONJA PETROVIC, ALESSANDRO RINALDO, AND STEPHEN E. FIENBERG 



(2) at every step, starting from the current position x old , 

(a) randomly choose a vector /' £ M. and a number e G { — 1,1}; 

(b) if x old + ef G X n , move to x new = x old + ef E X n , otherwise stay at x ncw = x old ; 

(3) if GF(x new ) is larger than GF(x) set k = k+1; 

(4) repeat the steps (2) and (3) K times. 

Provided that the algorithm is run for a sufficiently long time (i.e. K is large enough), 
the number k/K is an accurate estimate of a x . See [7] for details and variants of the basic 
Markov basis algorithm described above. 



3. Markov bases of p x models 

In this section we study the properties of Markov bases for the three versions of the p\ 
model described in section [2J 

Our analysis presents certain aspects that set it apart from most of the existing literature 
on Markov bases. Indeed, the traditional algebraic geometry machinery generates Markov 
bases that are "universal", in the sense of depending only on the design matrix A, and not 
the sample space and its properties. The reason for this is that Markov bases are obtained as 
generating sets toric ideals, and thus they cut out a complex toric variety; while the model 
itself lies in the positive real part of the variety. As a result, Markov bases tend to be very 
large even for design matrices of moderate size. In contrast, as we noted above, the sample 
space for network data is very highly constrained, since each dyad can only be observed 
in one and only one of the four possible configurations. Consequently, many of the basis 
elements of any Markov bases are not applicable, since they violate this basic constraint of 
the data. Thus, once we find the Markov bases, we still need to be careful in identifying 
what elements are relevant to our data and in removing the moves that are not applicable. 
To our knowledge, this is the first time this issue has been addressed in the Markov bases 
literature. 

On the other hand, we are able to decompose every Markov basis element using certain 



"basic" moves (Theorem 3.22), which are, as we will see, statistically meaningful by defini- 
tion. The key idea is to decompose the toric ideal of the pi model using ideals which are 
known and easier to understand. Namely, ignoring the normalizing constants Aj, reveals 
a connection between p\ models and toric varieties which are associated to certain graphs. 
These have been studied by the commutative algebra and algebraic geometry community, 
specifically Villarreal [20] and Ohsugi and Hibi [5]. We will use this connection to explain 
the form of Markov bases of p\ models. In terms of ideal generators for the pi model, re- 
introducing the normalizing constants does add another level of difficulty. However, in terms 
of moves on the network, we can avoid this difficulty by exhibiting the decomposition of the 
moves (although inapplicable in terms of ideal generators) using well-understood binomials 
arising from graphs. This approach reduces the complexity and size of Markov moves. In 
addition, it allows us to bypass the study of those basis elements are not applicable due to 
the constraints described above. 



3.1. The toric ideal of the pi model. Recall that the model consists of the points of the 
probability simplex that are in the row space of the design matrix. It follows that our model 



ALGEBRAIC STATISTICS FOR A DIRECTED RANDOM GRAPH MODEL WITH RECIPROCATION 7 



is, in fact, the positive part of the variety that is (the closure of) the image of the map (p n . 
For more discussion on the geometry of the model, see [T5] . 

To understand the model, we ask for all the polynomial equations that vanish on all of 
the points of the model; this set of equations is the defining ideal of the variety. In the case 
of log-linear models, the ideal is a toric ideal (see [19] for a standard reference). It can be 
computed from the kernel of the design matrix M: 

I M = ( p u -p» :u -v E kernel(M)) , 

where p u denotes a monomial. Any generating set of the defining ideal Im gives a Markov 
basis for our model. This is the Fundamental Theorem of Markov Bases (see [7], or [8], 
Theorem 1.3.6). It describes the relations among the Pij{*, •), and can be used for a random 
walk in any given fiber consisting of the points with the same minimal sufficient statistics, 
and thus to compute the exact distribution of the model for goodness-of-fit purposes. Note 
that the sufficient statistics, in this case, are the in- and out- degree distributions of the 
nodes. 

In order to enumerate all networks with the same degree distributions, one might want to 
use a Grobner basis instead. A Grobner basis is a generating set of the ideal, usually non- 
minimal, with some special structure desirable to have for computations. It is guaranteed 
to connect all points in a given fiber; every Grobner basis is a Markov basis. A Grobner 
basis can be made unique by requiring it to be reduced. There are finitely many reduced 
Grobner bases, and the union of all of them is contained in the set of primitive binomials, 
called the Graver basis of the ideal. The Graver basis is generally very hard to compute, but 
sometimes easier to describe algebraically, and its structure naturally implies constraints on 
the structure of the minimal Markov moves. 

Our first goal is to understand the structure of these Markov bases for the three cases of 
the pi model. Even though their size grows rapidly as we increase the number of nodes, there 
is quite a lot of structure in these generating sets. In what follows, we will first illustrate 
this structure on some small networks. 

Let us first fix some notation for the remainder of the paper. Since there are three cases of 
the pi model, we need three different names for the design matrices of the n-node network. 
The design matrix depends on the choice of n and pif 

(1) For the = 0, when the reciprocal effect is zero, the design matrix for the 
n-node network will be denoted by Z n . 

(2) For the case of constant reciprocation, i.e. = p, the n-node network matrix will 
be denoted by C n . 

(3) When reciprocation is edge-dependent, i.e. pij = p + pi + pj, the design matrix will 
be denoted by £ n . 

3.2. Markov bases of small networks. 

3.2.1. Case I: no reciprocation, (p^ = 0) 

This is clearly a special case of p^ = p, but we treat it here separately. We will see that, 
algebraically, it is interesting in its own right. 
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Let's start with the simplest nontrivial example: n = 2. The design matrix 



Z, 



encodes the parametrization <p 2 of the variety. Its rows are indexed by parameters as in- 
dicated, while the columns are indexed by pi 2 (0,0), pi 2 (l,0), Pi2(0, 1), Pi2(l, 1)- The ideal 
Iz 2 = (Pi2(l; 0)pi 2 (0, 1) — pi 2 (l, l)pi2(0, 0)) is the principal ideal generated by one quadric, 
and thus this single binomial is a Markov basis, and also a Grobner basis with respect to 
any term order. This can be verified by hand, or using software such as 4ti2 [TJ. 

Remark 3.1 (Prom binomials to moves). In general, we can translate binomials to 
moves in the following way: we will remove all edges that are represented by the py's in 
the negative monomial, and add all edges represented by the Py's in the positive monomial. 
Note that if Py(0, 0) occurs in either, it has no effect: it says to remove or add the "no-edge", 
so nothing needs to be done. However, there is a reason why Pij(0, 0)'s show up in the ideal: 
the structure of Z n for any n requires that each binomial in the ideal is homogeneous with 
respect to the pair Here, for example, since the positive monomial is of degree two, 

the negative monomial has pu(0, 0) attached to it to ensure it also is of degree two. 
Thus, the generator of Iz 2 represents the following Markov move: 
Delete the bidirected edge between 1 and 2, 
and replace it by two edges: (1,2) and (2,1). 

However, if we would like to require at most one edge per dyad, then this binomial is 
meaningless and there aren't really any allowable Markov moves. Logically, the case of no 
reciprocation somehow contradicts this requirement, since if = 0, we always get that a 
bi-directed edge between two nodes is valued the same as two edges between them. Thus 
the assumption of only one edge per dyad makes this problem so much more complicated, 
as relations like this one for any dyad in an ra-node network will appear in the generating 
sets of the ideal Iz n , but we will never want to use them. 

Next, let n = 3. The toric ideal Iz 3 is minimally generated by the following set of binomi- 
als: p 23 (0, l)p 23 (l, 0)-p2s(l, 1)P2 3 (0, 0), p 13 (0, l)p 13 (l, 0)-p 13 (l, l)pi 3 (0, 0), p 12 (0, l)pi 2 (l, 0)- 
Pi 2 (l, l)Pi2(0, 0), pi 2 (0, l)pi 3 (l, 0)p 23 (0, 1) -Pi2(l, 0)pi 3 (0, l)p 23 (l, 0). It is interesting to note 
that the first 3 generators are precisely the binomials from Iz 2 for the three dyads {1,2}, 
{1, 3}, and {2, 3}. The only statistically meaningful generator is the cubic. It represents the 
following move: 

Replace the edges (1,2), (2,3), (3,1) by (2,1), (3,2), (1,3). 

Graphically, it represents the three-cycle oriented two different ways: the positive monomial 
represents the cycle (1, 3, 2, 1), while the negative monomial represents its inverse, (1, 2, 3, 1), 



as depicted in figure 3.1 



Suppose now that n = 4. A minimal generating set for the ideal Iz 4 consists of 151 
binomials: 6 quadrics, 4 cubics, 93 quartics and 48 quintics. Some of these violate the 
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Figure 3.1. A degree-three move on 3 nodes: dashed edges are replaced by 
full edges. 

requirement that each dyad can be observed in only one state. As it is impractical to 
write all of these binomials down, we will list just a few of those that are statistically 
meaningful (i.e. respect the requirement of at most one edge per dyad at any time). 
As expected, the quadrics and the cubics are simply the generators of Iz 3 for the four 
3-node subnetworks of the four-node network. As we've seen, the quadrics are not of 
interest. The cubics represent the three-cycles. Here is a list of sample quartics, writ- 
ten in binomial form: p 12 (l, l)p 34 (l, l)P2s(0, 0)p 14 (0, 0) - p 12 (0, 0)p 34 (0, 0)p 23 (l, l)Pu(l> 1), 
p 23 (l, l)p 14 (l, l)p 13 (0, 0)p 24 (0, 0) - p 23 (l, 0)p 14 (l, 0)p 13 (0, l)p 24 (0, 1), 
p 23 (l, l)p 14 (l, l)pi 2 (0, 0)p 34 (0, 0) - p 12 (l, 0)p 23 (l, 0)p 34 (l, 0)pu(0, 1), 
Pi 2 (0, 0)p 23 (l, l)p 34 (0, l)p 14 (l, 0) - p 12 (l, 0)p 23 (l, 0)p 34 (l, l)p 14 (0, 0). 
Finally, we list some representative quintics: 

Pi 2 (0, 0)p 23 (l, l)p 34 (0, l)pi 4 (0, l)p 24 (l, 0)-p 12 (0, l)p 23 (l, 0)p 34 (l, l)p 14 (0, 0)p 24 (0, 1), p 12 (l, 0)p 23 (l, 0)p 14 (0, ( 
Pi2(0,l)p 23 (l,l)pi 4 (l,0)p 13 (l,0)p 24 (0,0). 

This set of Markov moves is quite more complex then the 10 moves originally described by 
Holland and Leinhardt for the 4-node case. We will postpone any further analysis of these 
binomials until the next section. For now, let us note that all of them preserve the in- and 
out- degree distributions of the nodes in the network. After we study the other two cases 
for pij, we will see a recurring underlying set of moves which can be used to understand the 
ideals. 

3.2.2. Case II: constant reciprocation, (p^ = p) 

Now we introduce one more row to the zero-p design matrix Z n to obtain the constant-/? 
matrix C n . Namely, this row represents the constant p added to those columns indexed by 
Pij (1,1) for all i,j G [n]. It is filled with the pattern 0,0,0,1 repeated as many times as 
necessary. For example, the design matrix for the 2-node network is as follows: 
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In this case the ideal is empty (there is nothing in the kernel of C 2 ), which is expected since 
the only relation in the case of p^ = required that there is no reciprocation. Here, the 
bidirected edge is valued differently then the two single edges in a dyad; this is the meaning 
of the last row of the design matrix. 
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For the 3-node network, the Markov move consist only of the cubic from the case p t j = 0: 
Pi 2 (0, l)pi 3 (l, 0)^23(0, 1) — ^12(1, 0)pi 3 (0, 1)^23(1, 0). On a side note, even the Graver basis 
consists only of this move and 15 other non-applicable moves (those which ignore the single- 
edged dyad assumption). 

Let n = 4. The software 4ti2 outputs a minimal generating set of the ideal Ic 4 consisting 
of: 4 cubics, 57 quartics, 72 quintics, 336 binomials of degree 6, 48 of degree 7, and 18 of 
degree 8. Out of this set, the applicable Markov moves are the same as in the CclSG Pij = with 
a few degree-six binomials added, such as: Pn(®, 0)^13(1, l)pi 4 (l, l)p 23 (0, l)p 24 (l, 0)p 34 (0, 0) — 
p 12 (l, l)pi 3 (0, l)p 14 (l, 0)p 23 (0, 0)p 24 (0, 0)p 34 (l, 1). 

3.2.3. Case III: edge-dependent reciprocation, (p^ = p + Pi + Pj) 

To construct the design matrix £ n for this case, we start with the matrix C n from the case 
Pij = p, and introduce n more rows indexed by p 1 , . . . , p n . Every fourth column of the new 
matrix, indexed by ^(1,1), has two nonzero entries: a 1 in the rows corresponding to p^ 
and pj. For example, when n = 2, the matrix looks like this: 



"1 


1 


1 


r 


^12 





1 





1 










1 


1 
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1 
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pi 











i_ 


P2 



This is a full-rank matrix so the ideal for the 2-node network is empty. 
With n = 3we get the expected result; the ideal Ig 3 is the principal ideal 

I £s = (p 12 (l, 0)p 23 (l, 0)p 13 (0, 1) - p 12 (0, l)p 23 (0, l)p 13 (l, 0)). 

With n = 4 we get the first interesting Markov moves for the edge- dependent case. The 
software 4ti2 outputs a minimal generating set of the ideal I £i consisting of: 4 cubics, 18 
quartics, and 24 quintics. The cubics, as usual, represent re-orienting a 3-cycle. Similarly 
some of the quartics represent 4-cycles. And then we get a few more binomials, of the 
following types: 

Pis(0, 0)p 24 (0, 0)p 14 (0, l)p 23 (0, 1) - p 13 (0, l)p 24 (0, l)pi 4 (0, 0)p 23 (0, 0) of degree four, and 
Pis(0, 0)p 24 (0, 0)p 14 (0, l)p 12 (l, 0)p 23 (l, 0) -pi 3 (l, 0)p 24 (0, l)p 14 (0, 0)p 12 (0, l)p 23 (0, 0) of degree 
five. Note that these two are just representatives; we may, for example, replace every Pij(0, 0) 
in each of them by 1), and get other Markov moves which are minimal generators of 

the toric ideal 

3.3. From the pi model to an edge subring of a graph. A careful reader will have 
noticed a pattern in the moves that have appeared so far. To that end, let us single out two 
special submatrices that appear in the design matrices in each of the three cases. 
1) First, for each case, we consider the matrix of the simplified model obtained from the 
Pi model by simply forgetting the normalizing constants A^- . Let us denote these simplified 
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matrices by Z n , C n and S n . Note that ignoring A^'s results in zero columns for each column 
indexed by Pij(0, 0), and so we are effectively also ignoring all Pij(0, 0)'s. Hence, the matrices 
Z n , C n and E n have (") l ess rows and (") less columns than Z n) C n and 8 n , respectively 
2) The second special matrix will be denoted by A n and is common to all three cases. It is 
obtained from Z n , C n or £ n by ignoring the columns indexed by 1) for all i and j, and 
then removing any zero rows. 

While these constructions may seem artificial at a first glance, we will soon see that they are 
helpful in effectively describing Markov bases for pi model for any n. 

Let us consider an example. The ideal of is generated by the four cubics representing 
the 3-cycles, and six quadrics, each of which represents the following move for some choice 
of i,j,k,l e {l,...,4} : 

Replace the edges (i,j) and (l,k) by the edges (i,k) and (l,j). 

Graphically, the move simply exchanges the heads of the directed arrows while keeping the 



tails fixed, as illustrated in Figure 3.2 



Figure 3.2. A degree- two move: dashed edges replaced by full edges. 

It turns out that these moves are basic building blocks for the Markov moves for the 4- 
node network, decorated with some other edges to ensure homogeneity, and sometimes this 
decoration can be quite non-obvious. They depend on the homogeneity requirements which 
are there for all three cases of pi, but also on the way that bi-directed edges might interact 
with "regular" edges, specially in the case of no reciprocation. In particular, we will see 



(Theorems 3.11 , 3.15) that the ideals of the simplified models are a sum of the ideal and 



another nice toric ideal. It will then follow that the ideal of our model is a multi-homogeneous 



piece of the ideal of the simplified model (Theorem 3.21). Equivalently, the corresponding 
varieties are obtained by slicing the simplified-model varieties with hyperplanes. The upshot 
of the decomposition is that the Markov moves can be obtained by overlapping simple moves. 

Example 3.2. The following binomial is a generator of the ideal Iz A '■ 

Pi 2 (l, 0)p 13 (l, l)p 23 (l, 0)p 24 (l, 0) - Pl2 (0, l)p 13 (l, 0)p 14 (l, 0)p 23 (l, i). 

The move itself, is equivalent to performing a sequence of two simple moves, as illustrated 



in Figure 3.3 



replace the cycle (1,2,3,1) by the cycle (1,3,2,1) , 
followed by 

replace the edges (1,3) and (2,4) by the edges (1,4) and (2,3). 

This "decomposition" depends on the fact that reciprocation is zero, so that the double edge 
is valued the same as two regular edges. 
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Figure 3.3. A sequence of two moves on 4 nodes: dashed edges are replaced 
by full edges. 



The following example illustrates that not all Markov moves are obtained in the same 
fashion. 

Example 3.3. Consider the case of edge-dependent reciprocation on n = 4 nodes. The 
following degree-five binomial appears as a minimal generator of the ideal i£ 4 , for a choice 
of 1 < i, j, k,l < n: 

Pij (l, 0)p ifc (0, 0)p«(0, l)p,- fe (l, 0) Pjl (0, 0) - Pij (0, l) Pik (l, 0) Pil (0, 0) Pjh (0, 0) Pj i(0, 1). 
Clearly this move can be obtained by the following sequence of simple moves, illustrated in 



Figure 3.4 



Replace the edges (l,i) and (j,k) by the edges (l,k) and (j,i), 
followed by 

replace the edges (i,j) and (l,k) by the edges (i,k) and (l,j). 




Figure 3.4. A sequence of two moves on 4 nodes: dashed edges are replaced 
by full edges. 

Note that we do not remove or add the empty edge represented by P ij(0,0); but these 
variables are required by homogeneity. 

3.4. The toric ideal of the common submatrix. (A n ) 

Focusing on the submatrix A n reveals additional structure which can be studied using 
some standard algebraic techniques. To that end, we recall the following standard definition 
(see |20J). 

Definition 3.4. Let k be any field (e.g. k = C), G be any graph and E(G) the set of its 
edges. If we consider the vertices of G to be variables, then the edges of G correspond to 
degree-two monomials in those variables. The ring 

k[G] :=k[xy\ (x,y)eE(G)} 

is called the monomial subring or the edge subring of the graph G. Its ideal of relations is 
called the toric ideal of the edge subring. 
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We will be interested in special graphs. Let G n := K n<n \{(i,i)}™ =1 be the complete 
bipartite (undirected) graph K n n on two sets of with n vertices each, but with the vertical 
edges (i, i) removed. If we label one set of vertices aci, . . . , a n and the other set pi, ... , (3 n , 
then our graph G n has edges («j, fy) for all i ^ j. Thus we are interested in the ring 
k[G n ] := k[ai(3j \ (cti, (3j) is an edge of G n }. 

Lemma 3.5. Adopting the notation above, is the toric ideal of k[G n }. 

Proof. Returning to our definition of A n , we see that it is the incidence matrix of the graph 
G n ; that is, the columns of the matrix correspond to the exponents of the monomials repre- 
senting the edges of G n . Thus the claim readily follows by the definition of the toric ideal of 
k[G n ] from Section 8.1. in [20]. □ 

We will use this correspondance to obtain a Grobner basis of Iz n and Ig n . But first, let 
us introduce one more concept which is crucial to the description of a Grobner basis of J_4 n . 

Definition 3.6. Following [5 J and [20], we say that a binomial f c arises from a cycle c of 
Gn if: 

1) c is a cycle of the graph G n ; namely, c is a closed walk on a subset of the vertices of G n 
such hat no vertex is visited twice: c = (a^, f3j 1 , aci 2 , {3j 2 , . . . , {3j k , a^), 

2) f c is the binomial obtained by placing the variables corresponding to the edges of the cycle 
c alternately in one and then the other monomial of f c ; that is, f c := /+ — /~ with /+ : = 
p ilh (l, 0)p i2j2 (l, 0) . ..pi^^il, 0)p iijk (l, 0) and /" := p i2jl (l, 0)p i3h (l, 0) . . .p ihjk _ 1 (l,0)p iljh (l 
where for ease of notation we have let 0) = Pji(0, 1) if i > j. 

There are finitely many cycles in G n , though they may be nontrivial to enumerate. How- 
ever, we will use this description to provide (theoretically) an explicit Grobner basis for our 
toric ideal. In practice, one can use the program 4ti2 to obtain the binomials fairly quickly. 
Grobner bases, and even Graver bases, for the ideals lj^ n are known ([5], [20J ) : 

Theorem 3.7 (Bases of the ideal of the common submatrix). Let Q n be the set of binomials 
arising from the cycles of the graph G n . Then Q n is a Grobner basis of Ijs, n . Thus it is also 
a Markov basis of I^ n , but not necessarily a minimal one. Moreover, this set is a Graver 
basis of the ideal, and it coincides with the universal Grobner basis. 



Proof (outline). We will use Lemma 3.5 and the appropriate results from [5] and [20] about 
the toric ideals of bipartite graphs. Note that there is quite a bit of vocabulary used in the 
cited results, but we have only defined those terms which we use in our description of the 
Grobner basis in the Theorem. 

Oshugi and Hibi in [5] (Lemma 3.1.), and also Villarreal in [20J (Proposition 8.1.2) give a 
generating set for I^ n . Moreover, Lemma 1.1. of [5] implies that these generators are in fact 
the Graver basis for Ij^ n . On the other hand, from Chapter 8 of [20] we see that precisely 
these generators in fact correspond to circuits, as well as the universal Grobner basis for 
J_4 n . It follows that the circuits equal the Graver basis for our ideal J_4 n . Avoiding technical 
details, we will just state that circuits are a special subset of the Graver basis: they are 
minimal with respect to inclusion. The binomials in the Graver basis are given by the even 
cycles of the graph G n (Corollary 8.1.5. in [20]). But since our graph is bipartite, all of the 
cycles are even (Proposition 6.1.1. in [20J ) , so it suffices to say that the Graver basis of 
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consists of binomials arising from the cycles of the graph G n . Since the universal Grobner 
basis (which equals Graver basis in this case) is by definition a Grobner basis with respect 
to any term order, the proof is complete. The binomials arising from the cycles of the graph 
G n form the desired set Q n . □ 

In addition, we have a nice recursive way of constructing the binomials in Q n . 

Proposition 3.8. The set Q n consists of precisely the binomials in Q n -\ for all of the n — 1- 

node subnetworks, together with the binomials arising from the cycles of the graph G n which 
pass through either ol^ or fa for each % between 1 and n. 

Proof. The last condition can be restated as follows: we are looking for the primitive bino- 
mials / such that for each node i in the random network, there exists an edge such that 
the variable Pij(a, b) appears in one of the monomials of /. The reduction to the n — 1-node 
subnetworks is immediate from Proposition 4.13. in |19J . Namely, the design matrices for 
the n — 1-node networks are submatrices of A n , hence the Graver basis of Ia„-i equals that 
of I_A n involving at most n — 1 nodes. □ 

An example of a degree 5 binomial in J^ 5 which uses all 5 nodes is 

Pu(l, WO, l)p 23 (l, 0)p 24 (0, l)p 35 (l, 0) - p 14 (0, l)p 15 (l, 0)p 23 (0, l)p 24 (l, 0)p 35 (0, i). 

The first term represents a cycle (1,4,2,3,5), while the second term represents its inverse, 
(1, 5, 3, 2, 4). In terms of the graph G 5 , each of the monomials represent edges, in alternating 
order, of the cycle (a 1; /3 4 , a 2 , /3 3 , «5, A, «4, 02, «3, As, «i)- 

Remark 3.9. In the Proposition above, all binomials have squarefree terms. This means 
that the initial ideal of the toric ideal is squarefree. Using a theorem of Hochster (|14j). this 
implies that the coordinate rings of the corresponding varieties are arithmetically Cohen- 
Macaulay. This is a powerful algebraic-geometric property that is not encountered too 
often. 

Remark 3.10. There is a special property of the design matrix, called unimodularity which, 
if satisfied, implies that all initial ideals or its toric ideal are squarefree, and also that circuits 
equal the Graver basis. In general, for the design matrix of the p\ model it does not hold. 

We are now ready to embark on a more detailed study of the simplified models. 

3.4.1. Case I: no reciprocation, (py = 0, simplified model) 

Let's start with the simplest nontrivial example: n = 2. The rows of the design matrix Z2 
are [1, 0, 1], [0, 1, 1], [0, 1, 1], [1, 0, 1], [1, 1, 2], and are indexed by 04, a 2 , (3\, /3 2 , and 9, while 
the columns are indexed by pi 2 (l, 0), pi 2 (0, 1), pi 2 (l, 1). One easily checks that the ideal Ig 2 
is the principal ideal 7j 2 = (pi 2 (l, 0)pi 2 (0, 1) — pi 2 (l, 1)) and thus this single binomial is a 
Markov basis, and also a Grobner basis with respect to any term order. 

Next, let n = 3. The toric ideal Jj 3 is minimally generated by the following set of bino- 
mials: p 23 (0, l)p 23 (l, 0) - p 23 (l, 1), p^O, l)p 13 (l, 0) - p 13 (l, 1), p 12 (0, l)p 12 (l, 0) - p 12 (l, 1), 
Pi 2 (0, l)pi 3 (l, 0)p 23 (0, 1) — pi 2 (l, 0)pi 3 (0, l)p 23 (l, 0). It is interesting to note that the first 3 
generators are the "trivial" ones (as seen in case n = 2). 
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For the network on n = 4 nodes, we get the first interesting Markov basis elements. 
Namely, there are inhomogeneous binomials of degree 2 which represent the "obvious" rela- 
tions, which are of the form Pij(Q, l)p^(l, 0) — Pij(l, 1). Next, there are squarefree quadrics 
(homogeneous binomials of degree 2): pik(l,0)pji(l,0) — pu(l,0)pjk(l,0), and there are 
degree-three binomials that resemble the degree-three generator of the ideal Jj 3 : 
Pij(0, l)pifc(l, 0)p jk (0, 1) 0)Pife(0, l)Pjk(l, 0). Note that if we write p jk (l, 0) and j > k, 

then we mean pkj(0, 1), since for all Pj k (a, b) we assume j < k. 

In order to obtain a Grobner basis for the ideal I? , we can use what we know about 
I_A. n . The advantage of studying over Ij is that it is a homogeneous ideal; that is, both 
monomials in each binomial appear with the same degree. One can see that the ideal is 
homogeneous by inspecting the columns of the matrix: each column has the same 1-norm. 
The ideal of the simplified model admits a nice decomposition: 

Theorem 3.11 (Decomposition of the ideal of the simplified model Z n ). With the nota- 
tion as above, Ij? = Ij^ + T , where T is the ideal generated by the binomials of the form 
Pij(0, l)pij(l, 0) — Pij(l, 1) for all pairs of nodes i < j. 

Proof. One inclusion (d) is clear. To prove the other inclusion, consider / e 7j such 
that Pij(l, 1) divides one of its terms for some Since the ideal is toric, it suffices to 
consider the case when / is a binomial. Then it can be written as / = Pij(l, l)m\ — rri2 
where m\ and m 2 are monomials. But then we can write / = (pij(l,0)pij(0, l)rrii — m 2 ) — 
mi(pij(0, l)pij(l,0) — Pij(l, 1)). Repeating this procedure if necessary, we can write / as a 
combination of binomials whose terms are not divisible by any of the variables Pij(l, 1) and 
those binomials that generate T. To conclude, note that those not divisible by any of the 
Pi j (1,1) are in the ideal by definition. □ 

Combining the above results, we obtain a Markov (Grobner) basis for the ideal of the 
simplified model: 

Theorem 3.12 (Grobner basis for the simplified model Z n ). Let Qt be the binomial gener- 
ators of T , that is, Qt '■= {Pij(0, l)pij(l, 0) — Pij(l, 1)} for all pairs of nodes i < j. Let Q r , 



be the set of binomials arising from the cycles of G n , as in Theorem (3.7). Then the ideal 
of the simplified model 1% has a Grobner basis, and thus a Markov basis, consisting of the 
union of the binomials in Q n and Qt- 

Proof. First we will show that the set Qt actually forms a Grobner basis for T. Namely, pick 
an elimination order where the variables Pij(l, 1) are greater then the remaining variables. 
Then, the degree-one terms of the binomials in Qt are the initial terms. Since they are all 
relatively prime, the given generators form a Grobner basis as all S-pairs reduce to zero (for 
details about Grobner basis computations, the reader should refer to Buchberger's criterion 
[3] as well as standard references [6] and [T9]). 

Next, take any Grobner basis Q n for the ideal 7_4 n with respect to some order -<. According 



to Lemma 3.11, Q n U Qt generates the ideal . Let -<' be the refinement of the term order 
-< by the elimination order used for Qt- The we get that Q n U Qt is in fact a Grobner basis 
for Ij, with respect to -<', since no initial term in Qt appears as a term in Q n and thus all 
the S-pairs will again reduce to zero. The Grobner basis Q n for the ideal follows from 



Theorem (3.7). □ 
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To derive minimal Markov bases for the simplified model, it remains to find a minimal 
generating set for the ideal of the edge subring of an incomplete bipartite graph. To the 
best of our knowledge, there isn't a result that states these are generated in degrees two and 
three. In particular, we claim: 

Conjecture 3.13. A minimal Markov basis for the ideal 1% consists of the elements of 
degrees 2 and 3. All inhomogeneous elements are of the form Pij{0, l)py(l, 0) — p„(l,l). 
All quadrics are of the form Pi k (l, 0)pji(l, 0) — pu(l, 0)pj k (l, 0), where i, j, k, I vary over 
all possible four-node subnetworks. All cubics are of the form p i j(0,l)pn ;: {l,0)pjk{0,l) — 
Pij(l,0)pik(0,l)pjk{l,0), where i, j, and k vary over all possible triangles (3-cycles) in the 
network. 

3.4.2. Case II: constant reciprocation, (pjj = p, simplified model) 

There is a small but crucial difference between Z n and C n : one more row, representing 
the constant p added to those columns indexed Py(l, 1) for all i,j G [n]. This row is filled 
with the pattern 0, 0, 1 repeated as many times as necessary. Note that adding the extra 
row makes this ideal homogeneous, that is, the two terms of each binomial have the same 
degree. (Note that homogeneity is easy to verify: the dot product of each column of C n with 
the vector w = [1, . . . , 1, — n — 1] results in the vector [1, . . . , 1], as required, for example, by 
Lemma 4.14. in [IS].) For two nodes, the ideal is trivial. For n = 3 nodes, the Markov ba- 
sis consists of 4 binomials of degree 3 of the following forms: Pifc(l, 0)pi k (0, l)pj k (l, 1) — 
p ik (l,l)p jk (l,0)pjk(0, 1), Pij(l, 0)pi k (0,l)p jk (l,0) - pij(0, l)p ik (l,0)p jk (0, 1), and 6 of de- 
gree 4 of the form: p^ (0, l) 2 p ifc (l, l)p jfc (0, 1) - Pij(l, l)p ik (0, l) 2 p jk (l, 0). For n > 4, the 
Markov bases consist of elements of the above type, but also include quadrics of the form: 
p ik (l, 0)pji(l, 0) - p u (l, 0)p jk (l, 0), p fi (l, 1)ph(1, 1) - Pifc(l, l)Pji(l, 1). 

Conjecture 3.14 (Minimal Markov basis of the simplified model C n ). The ideal is 
generated in degrees 2, 3, and 4, and the binomials are of the form described above. 

Note: due to the existence of p^-, we do not get the relations Qt from the case pjj = 0. 
Recall they were of the form Pij(l, 0)py(0, 1) — p^ (1, 1). They cannot be in the ideal in this 
case, since the bi-directed edge between i and j is valued differently then the two single edges 
and (J, i). Grobner bases or even Markov bases in this case remain an open problem 
that we will continue to study. 



3.4.3. Case III: edge-dependent reciprocation, (pij = p + pj + pj, simplified model) 

To construct the design matrices Z n for this case, we start with the matrix C n from the 
case pij = p, and introduce n more rows indexed by pi, . . . , p n . Every third column of the 
new matrix, indexed by py(l, 1), has two nonzero entries: a 1 in the rows corresponding to 
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pi and pj. For example, when n 



3, the matrix 


is: 








"1 


1 


1 





1 








0" 


1 


1 











1 





1 











1 


1 





1 


1 


1 


1 





1 


1 











1 


1 














1 


1 








1 





1 


1 





1 


1 1 


2 


1 


1 


2 


1 


1 


2 





1 








1 








1 





1 








1 














1 

















1 














1 








1 



The kernel of this matrix is generated by one vector, and in fact the ideal Ig 3 is the principal 
ideal Ig 3 = (p 12 (l,0)p 23 (l,0)p 13 (0,l) - pi 2 (0, l)p 23 (0, l)pi 3 (l, 0)). When we calculate the 
ideals for larger networks, some more familiar binomials appear. 

Theorem 3.15 (Decomposition of the simplified model £ n ). Let Ij^ 

3.4 Let Q be the ideal generated by the quadrics Pij(l, 1)pm( 



1,1 



be the ideal as in Section 
p ik (l, l)pji(l, 1) for each 



set of indices 1 < i,j, k,l < n. Then Ig = I^. n + Q for every n > 4. 

Proof. Consider the submatrix of £ n consisting of those columns that have only zeros in the 
last n rows. Erasing those zeros, we get precisely the columns of A n . Thus Ig D I^ n . 

Similarly, the ideal of relations among those columns that have nonzero entries in the last 
n rows is also contained in Ig . We will now show that this ideal equals Q. To simplify 
notation, let M be the matrix consisting of the columns in question, that is, those that are 
indexed by Pij (1, 1) for all pairs i < j. Recalling the definition of the action of the simplified 
parametrization map on p i3 -(l,l): ^-(1,1) i-» aif3jaj(3i9ppjPj, and the fact that the last n 
rows of M are indexed by pi, ■ ■ ■ ,p n , we see that to study the toric ideal Im it suffices to 
study the ideal Im 1 where M' is the submatrix consisting of the last n rows of M. But Im 1 
is a well-studied toric ideal! Namely, M' agrees with the incidence matrix of the complete 
graph K n on n vertices: for each pair {i,j} there exists an edge Therefore the toric 

ideal Im 1 agrees with the toric ideal of the edge ring k[K n ], where the vertices of K n are 
labelled by the nodes 1, . . . ,n, and an edge between i and j represents the variable Pij(l, 1). 
It is well-known (see, for example, [5]) that Im> is generated by quadrics. Moreover, from [I] 
and references given therein (see also Corollary 9.2.3. in |20j) it follows that these quadrics 
be interpreted as 2 x 2-minors of a certain tableaux. But our definition of the ideal Q is 
precisely the set of those 2 x 2-minors, thus Im 1 = Q and we obtain Ig D Q. 

To complete the proof, we need to show the reverse inclusion. To that end, let / = 
/ + — f~ G Ig . If no Pij(l, 1) divides either term of /, then / G Ij± n and we are done. On the 
other hand, suppose Pij(l, for some pair i,j. Then the definition of Ig implies that 

one of the following conditions are satisfied: 

1) Pij(l, But then the binomial / is not primitive, and thus it cannot be required in 
any minimal generating set, or a Grobner basis, of Ig . 

2) pfcj(l, for some other pair of indices k, I. But this in turn implies that 
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i) Pij(l, l)pki(l, l)\f~ and / fails to be primitive trivially; or 

ii) without loss of generality, Pik(l, l)pji(l, and / fails to be primitive by the quadric 
in Q ; or 

iii) / + is divisible by another variable p s t(l, 1), and the pattern continues. 

In general, it is clear that any / whose terms are divisible by the variables representing 
columns of M will fail to be primitive by one of the binomials in the ideal Q. Therefore, the 
ideal Ig is generated by the binomials of and Q, as desired. □ 



Next we obtain a Grobner basis for our toric ideal. Recall from Section |3.4| that G n = 
K ntn \{(i,i)}^ =1 . We have seen that the graph G n played an essential role in the case when 
Pij = 0, and it will play an essential role here as well, since it is essential in describing the 
Grobner bases of the common submatrix A n . However, in order to study the Grobner basis 
of the ideal Q, we need to use graphs with more edges. It comes as no surprise, then, that 



these ideals will have a more complicated Grobner basis then those from Section ^4 Thus 
we need to generalize Definition |3.6| for the complete graph K n on n vertices: 

Definition 3.16. As before, following [5] and [20], we say that a binomial f w arises from 
an even closed walk w of K n if: 

1) w is an even closed walk w of K n ; namely, w is a closed walk on a subset of the vertices 
of K n such that it is of even length (even number of edges) w = ({vi,v 2 }, . . . {v2k-i, v 2h}), 
with 1 < v i, . . . , t>2fc < n. The closed condition requires that v 2 k = V\. 

2) f w is the binomial obtained by placing the variables corresponding to the edges of the walk 
w alternately in one and then the other monomial of f w . In other words, f w := f£~fw with 

fw - = Pvi,V2 (1) ^)PV3,V4 (1) 1) • • • Pv 2 k-l,Vl (1)1) an( l fw - = Pl>2,V3 (1) ^)Pv4,Vs (!)!)••• Pv 2 k-2,'"2k-l ^) 1 

where for compactness of notation we have let 1) = Pji(l, 1) if i > j. 

We say that the even closed walk w is primitive if the corresponding binomial is primitive. 

Let us state the characterization of such walks from [5]: 

Lemma 3.17 Lemma 3.2.). A primitive even closed walk on a graph G is one of the 
following: an even cycle of G; or two odd cycles having exactly one common vertex; or two 
odd cycles having no common vertex together with two more walks, both of which connect a 
vertex of the first cycle with a vertex of the second cycle. 



In Section 3^4, we have seen only even cycles. This is because the graph in question, 
namely G n , was bipartite, and therefore had no odd cycles. We are now ready to state the 
main result of this section. 

Theorem 3.18 (Grobner basis of the simplified model £ n ). The ideal Ig n admits a Grobner 
basis consisting of the binomials arising from the cycles of the bipartite graph G n together 
with the binomials arising from the primitive closed even walks of K n , the complete graph on 
n vertices. 

Proof. Recall that Q n are precisely the binomials arising from the cycles of G n . We have 



seen in Theorem 3.12 that Q n form a Grobner basis for J_4 n (in fact, they are a Graver basis) 



Since the ideals J_4 n and Q are in disjoint sets of variables, it remains to find a Grobner basis 



for Q. We do this by generalizing the argument in the proof of Theorem 3.12| Namely, from 



the proof of Theorem 3.15 we know that Q is the toric ideal of the edge ring k[K n ]. This 



ALGEBRAIC STATISTICS FOR A DIRECTED RANDOM GRAPH MODEL WITH RECIPROCATION 19 



allows us to use [5] and j2D] again, and we obtain (e.g. Lemmas 3.1. and 3.2. in [5j) that 
the Graver basis of Q consists of the binomials arising from the primitive even closed walks 
on K n . In addition, note that Theorem 9.2.1 of [20] provides a quadratic Grobner basis as 
well. □ 



A proof of Conjecture 3.13 would imply the following 



Conjecture 3.19 (Minimal Markov basis of the simplified model S n ). For n > 4, the ideal 
Ig is minimally generated by homogeneous binomials of degrees 2 and 3. More precisely, the 
degree 2 and 3 binomials in Q n , together with the quadratic generators of Q, form a Markov 
basis for the model. 

3.5. Prom the edge subring of a graph back to the pi model. We are now ready to 
introduce the \j back into the parametrization and consider the original design matrices 
Z n , C n and £ n . Recall that they can be obtained from Z n , C n and £ n , respectively, by adding 
(™) columns representing the variables Pij(0, 0) for all i < j and (™) rows indexed by X^. 

Example 3.20. Let n = 4. Then, for the case in which pjj = 0, the 15 x 24 design matrix 
Z4 is: 



1111 


000000001111 
000000000000 
0000000000000000 
0000000000000000 



00000000000000000 
11110000000000000 

000000000 

1 



1 1 1 





1 1 

















1 1 







1111 












0000000001 



1 1 
00000000 
1 
1 



01120112011201120112 
01010101000000000000 

1 1 









1 1 









1 



11 








1 1 








1 1 
110 11 



1 1 
1 1 

000000000101000001010101 



0000000000 



00000101000001 







The following analysis applies to both the case of constant pij and the case of edge- 
dependent pij, thus we will treat them simultaneously. 

There is a very concise way to describe the new toric ideal in terms of the cases when Ay- 
were ignored. For a binomial /, we will say that / is multi-homogeneous with respect to each 
pair {i,j} if the degrees in the variables indexed by the pair {i,j} agree for the two monomi- 
als of /. More precisely, if / = f + -f~, then we require that deg Pij(0 0) (/ + ) + deg py(1)0) (/ + ) + 

d %, y (o,i)(/ + )+ de gpy(i,i)(/ + ) = de gp 1J (o,o)(/~)+ de gp !J (i,o)(/")+ de g PlJ (o,i)(/ _ 
for each pair {i,j}- This allows us to make a simple observation. 

Proposition 3.21 (Geometry of the p\ model). The toric ideals Iz n 
model on an n-node network are precisely those parts of the ideals I- 



)+ d eg PlJ (i,i)(/ ) 



I Cn and I £n of the p x 
, In and Ie 



respec- 



tively, which are multi-homogeneous with respect to each pair {i, j}. Therefore, the toric 
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variety for the pi models is obtained by slicing the corresponding varieties of the simplified 
model by Q) hyperplanes. 

Proof. The rows indexed by Ay that are added to the matrices , Ig and Ig require 
precisely that the binomials in the ideal are multi-homogeneous according to the criterion 
given above. For example, consider the first row indexed by A 1,2- For any binomial / = 
pu+ _ pu - m ideal, the exponent vector u + — u~ being in the kernel of the matrix means 
that the number of variables pi,2( # , •) which appear in p u+ equals the number of variables 
Pi,2( # , •) that appear in p u . 

For the hyperplane section statement, note that the varieties defined by /j , Iq and Ig 
live in a smaller-dimensional space than those defined by Iz„ , Ic n > and Is n , respectively; but 
we may embed them in the natural way into a higher-dimensional space. The hyperplanes 
are defined by the rows indexed by the Ay's. □ 

In general, multi-homogeneous part of any given ideal is not generated by the multi- 
homogeneous generators of the original ideal. But for the ideal of the p\ model we are able 
to use homogeneity to decompose the Markov moves into "essential" parts. The upshot of 
this result is that analysis and computations become easier. In addition, all moves obtained 
this way are applicable and irredundant. 

Theorem 3.22 (Essential Markov moves for the p\ model). The Markov moves for the p\ 
model in the case of zero and edge- dependent reciprocation can be obtained from the Graver 
basis of the common submatrix A n , together with the Markov basis for the ideal Q as defined 
in Theorem \3.15[ 

Remark 3.23. The case of constant reciprocation is more challenging as we do not yet 
have a simple decomposition as in the other two cases. However, a similar argument can 
be used to claim that the Markov moves in case of constant reciprocation can be obtained 
by repeating the moves for the simplified model, while respecting the multi-homogeneity 
requirement. 



Before embarking on a proof of 3.22 let us make the claim more precise. Take q = 
q + — q~ G Jj or Ig in the ideal of the simplified model. The monomials q + and q~ are 
represented by directed edges in our n-node network. If q + and q~ represent the same cycle 
with different orientation, then q is multi-homogeneous already. On the other hand, suppose 

d d 
Q = YlPidk( a k,h) - Y[p Sk t k (c k ,d k ) 

k=l k=l 

where p ikjk (a k ,b k ),p Sktk (c k , d k ) E {p y - (1, 0),py(0, l),Py(l, 1)}, i < j, for 1 < k < d. Then we 
may define 

d d d d 

q = n p ^ ^ n p ^ (°> °) ~ n ^ ^ n ^ (°. °) • 

k=l k=l k=l k=l 

Also, one can modifiy q by taking k in a subset of {1, . . . , d}, as we may not need every k 
from 1 to d to make q multi-homogeneous. We call each such q a lifting of q, including the 
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case when q is lifted using less then d variables in each term. (Note: in addition, if not all 
of (dfc, bk) and (c&, dk) are (1, 1), we may lift q by Pf/(1, 1) instead of Py(0, 0).) 

It is clear that all of these lifts are in the toric ideal of the model. It is not clear that 
these are sufficient to generate it (or give its Grobner basis). In particular, it seems that 
there is another kind of lifting that needs to be included to generate the ideal of the model. 
Essentially, it involves overlapping two (or more!) minimal generators of the ideal in a 
special way. 

For example, consider the binomial 

Pia(l, 0)p 13 (0, l)j?i 4 (0, 1> 23 (1, 1) - pia(0, l)p i3 (l, l)p 23 (0, l)p 24 (0, 1), 

which is in the ideal for the model on n > 4 nodes. It is not of the form q, that is, it is 
not lifted in the nice way described above. However, it can still be obtained from binomials 
on the graph on 3 nodes. Let /+ = p 12 (l, 0)pi 3 (0, l)p 23 (l, 0), /" = p 12 (0, l)pi 3 (l, 0)p 23 (0, 1) 
and g + = p u (0, l)p 23 (0, 1), g~ = p 13 (0, l)p 24 (0, 1). Note that / = /+-/- and g = g + - g~ 
are in the simplified model ideal for n > 3 nodes. If we use the fact that in the graph 
the edges from node 1 to node 3 and from node 3 to node 1 combine to a double-ended 
edge between 1 and 3, and if we define f M g = f + g + — f~g~, then we obtain precisely 
Pi, 2 (l, 0)pi )3 (0, l)p M (0, l)p 2)3 (l, 1) - pi, a (0, l)pi, 3 (l, l)p 2 , 3 (0, l)p 2 , 4 (0, 1). We will call such an 
operation (f^g) an "overlap" of two binomials, since it corresponds to overlapping the edges 
of the graphical representations of / + — /~ and g + — g~ . Take a binomial / Mg in the ideal 
of the model Ig n . Note that it may happen that neither / nor g are in Ig n . But in terms of 
moves, f M g is equivalent to performing two successive moves: the one defined by /, and 
the one defined by g. In particular, binomial overlaps give rise to consecutive Markov moves 
which respect multi-homogeneity. 

Remark 3.24. Note that Q appears only in the decomposition for the ideal Ig , and not Jg . 
But for example the binomial pi 2 (l, l)p 3 4(l, 1)P2 3 (0, 0)pi 4 (0, 0)-pi 2 (0, 0)p 34 (0, 0)p 23 (l, l)pi 4 (l, 1) 
is a homogenization of a generator of Q, and it lives in the ideal J_g 4 - Homogenization by 
Py(0, 0) does not affect the move itself. 



Proof of Theorem 3. 22 , Clearly, all moves from and Q can be homogenized by lifting: 
that is, if q G Ij\, n , then q G Is n - The proof relies on a simple, yet crucial, observation that 
by definition, the ideal of the model is contained in the ideal of the simplified model; e.g. 

Let / G Ig n . Then / G Ig . If f is in the Graver basis of Ig , then we are conclude by 
Theorem 3.18 Alternately, assume / is not in the Graver basis of Ig . Then there exists a 



binomial p G Ig such that p + divides / + and p divides / . Equivalently, we can write / 
as / = p M f for / defined appropriately (e.g. / + := f + /p + ). Since we may assume that p 
is primitive, we can keep decomposing / until we obtain an overlap of k primitive binomials 
from the ideal of the simplified model. Note also that we may assume that in the end we are 
not in the case where f + = f~- Indeed, if / is a multiple of another binomial in the ideal, 
say, / = f + g + — f + g~ , then g + — g~ is also in the ideal. In terms of moves, multiples do 
not contribute anything: they instruct us to remove and add the same edge. 

Replacing S n by Z n does not change the above argument. □ 

Using these two constructions of lifting and overlapping, we make the following conjecture: 
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Conjecture 3.25. Minimal Markov (Grobner) bases for the pi models can be obtained from 
Markov (Grobner) bases of by repeated lifting and overlapping of the binomials in the 
minimal Markov bases of various (n — I) -node subnetworks. 

We close this section by remarking that these various liftings imply a (possibly non-sharp) 
degree bound on the Markov and Grobner bases for the model. For example, each lifting of 
the first stype will add at most (™) edges to each monomial of q thus increasing the degree by 
at most (™) from the degree needed for the simplified model, while overlapping k binomials 
will allow the degrees of generators to increase k times. Note that we have already seen lifts 



and overlaps in Examples 3.2 and 3.3 



4. Discussion 

We close with a discussion about the relationship between our parametrization and the log- 
linear parametrization suggested by P, \TU\ . Fienberg and Wasserman's parametrization of 
the p\ model encodes it as a n 2 x 2 x 2 contingency table, where the first variable corresponds 
to a dyad and the second and third represent the four dyadic configurations. Thus, a network 
is represented as a point x in IR 4n with 0/1 entries. This log-linear parametrization is clearly 
highly redundant, as, besides the Multinomial constraints on each of the n 2 dyads, there are 
additional symmetric constraints of the form Xij^,i = x j,i,k,u f° r ah hj G { 1, - - - , n} and k, I G 
{0, 1}. Although, as shown in [TT], these redundancies can be convenient when computing 
the MLE, they are highly undesirable for finding Markov bases. Indeed, the toric ideal 
corresponding to the this parametrization has 4n 2 indeterminates while our parametrization 
only contains 2n(n — 1). For example, when n — 5, this means the toric ideal lives in 
the polynomial ring with 100 indeterminates, instead of 40 that we have. In addition, 
the number of generators explodes combinatorially: for the case of constant reciprocation, 
pij = p, the ideal of the network on n = 3 nodes has 107 minimal generators, and the 
one of the 4-node network has 80,610. The case when n = 5 is hopeless: even 4ti2, the 
fastest software available to compute generating sets of toric ideals, cannot handle it. One 
wonders what all of these extra generators are! First of all, note that this contingency-table 
representation is highly redundant. Most of the Markov basis elements are inapplicable 
because of the product multinomial constraints and the symmetry constraints. Finally, the 
many symmetries in the table imply many highly non-trivial constraints that have to be 
accounted for when eliminating non-applicable moves. We were able to analyze the n = 4 
case and reduce all of the 80610 moves to the ones we get using the design matrices C n , 
but the effort was nontrivial. Therefore, at least from the point of view of studying Markov 
bases, the parametrization we are using in the present paper is preferable. 

Our hope is that this article motivates a deeper study of the algebraic statistical challenges 
for pi models and their extensions. 
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